cqlengine: Remove deepcopy on UserType deserialization #277
This change stops the UserType instance newly created during deserialization from being immediately copied with deepcopy. That copy could cause a huge slowdown when the UserType contains a lot of data or nested UserTypes: the deepcopy calls would cascade, because each to_python call would eventually clone parts of the source object. Since there isn't much information on why this deepcopy was there in the first place, this change could potentially break something. Running the integration tests against this commit does not produce regressions, so the call looks safe to remove, but I'm leaving this warning here for future reference. Fixes scylladb#152
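To illustrate the cascade described above, here is a minimal toy sketch. It is not the cqlengine code: `Node`, `build`, `to_python_with_copy`, `to_python_without_copy`, and `count_copies` are hypothetical names made up for this example, and a chain of nested objects stands in for nested UserTypes. It assumes each level deep-copies its input before converting its child, which is the pattern this PR removes.

```python
from copy import deepcopy


class Node:
    """Hypothetical stand-in for a deserialized UserType value."""

    copies = 0  # how many times any Node has been deep-copied

    def __init__(self, payload, child=None):
        self.payload = payload
        self.child = child

    def __deepcopy__(self, memo):
        Node.copies += 1
        return Node(deepcopy(self.payload, memo), deepcopy(self.child, memo))


def build(depth):
    """Build a chain of `depth` nested nodes."""
    node = None
    for _ in range(depth):
        node = Node(list(range(10)), node)
    return node


def to_python_with_copy(value):
    """Toy conversion that deep-copies at every nesting level."""
    if value is None:
        return None
    copied = deepcopy(value)  # clones this node and everything below it
    copied.child = to_python_with_copy(copied.child)
    return copied


def to_python_without_copy(value):
    """Toy conversion that converts the freshly built value in place."""
    if value is None:
        return None
    value.child = to_python_without_copy(value.child)
    return value


def count_copies(convert, depth):
    """Return how many node copies one conversion of a chain triggers."""
    Node.copies = 0
    convert(build(depth))
    return Node.copies


if __name__ == "__main__":
    for depth in (4, 8, 16):
        print(depth,
              count_copies(to_python_with_copy, depth),
              count_copies(to_python_without_copy, depth))
```

In this toy model a chain of depth `d` triggers `d * (d + 1) / 2` node copies with the per-level deepcopy and none without it, so the work grows quadratically with nesting depth. The real cost in the driver also depends on how much data each UserType holds.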

Do you think you could open this PR against upstream too (datastax/python-driver)? I'd like to see if they see any reason not to merge this.
FYI, no one responded on the issue that was opened:

Opened in upstream: apache#1192

There has been no response from upstream yet. I'd say we will merge this performance improvement, since our testing didn't show it breaking anything.

@k0machi was talking about other places where there is a deepcopy that he wanted to remove, not this PR.
@k0machi did a manual test with Argus calls, which showed a big improvement across the board. We don't have a specific performance suite for this driver, especially not for cqlengine.
A before/after would hopefully suffice, showing reduced latency (I've seen elsewhere in the threads a reduction of seconds - #152 (comment) ?)
It doesn't quite show the latency numbers, but this request went down from an average of ~9s to an average of 750ms.
Thanks - this is what I was hoping to see. Flamegraphs are great, but perf results are more important.
fruch left a comment
Based on the information we do have, I think it's safe enough to merge.
And it improves perf quite a lot.

I believe we have a release pending this week - if so, I'd wait until after the release, instead of sending it in at the last minute.

@avelanarius - can we merge it? The lack of response from upstream is disappointing, but should not hold us back.

